Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 8190 |
| Missing cells | 24040 |
| Missing cells (%) | 24.5% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.2 MiB |
| Average record size in memory | 148.0 B |
Variable types
| NUM | 10 |
|---|---|
| BOOL | 1 |
| CAT | 1 |
Reproduction
| Analysis started | 2020-07-30 22:35:58.290523 |
|---|---|
| Analysis finished | 2020-07-30 22:36:17.060131 |
| Duration | 18.77 seconds |
| Version | pandas-profiling v2.7.1 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| Date has a high cardinality: 182 distinct values | High cardinality |
| MarkDown1 has 4158 (50.8%) missing values | Missing |
| MarkDown2 has 5269 (64.3%) missing values | Missing |
| MarkDown3 has 4577 (55.9%) missing values | Missing |
| MarkDown4 has 4726 (57.7%) missing values | Missing |
| MarkDown5 has 4140 (50.5%) missing values | Missing |
| CPI has 585 (7.1%) missing values | Missing |
| Unemployment has 585 (7.1%) missing values | Missing |
| MarkDown5 is highly skewed (γ1 = 50.2778242) | Skewed |
| Date is uniformly distributed | Uniform |
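The per-variable missing counts in the warnings can be cross-checked against the dataset-level "Missing cells" figures. The following is a minimal sketch that only re-does the arithmetic from the numbers printed in this report; the variable names are illustrative and not part of pandas-profiling.

```python
# Per-variable missing counts, copied from the warnings in this report.
missing = {
    "MarkDown1": 4158,
    "MarkDown2": 5269,
    "MarkDown3": 4577,
    "MarkDown4": 4726,
    "MarkDown5": 4140,
    "CPI": 585,
    "Unemployment": 585,
}

# Observations and variables, copied from the dataset statistics.
n_rows, n_cols = 8190, 12

# Missing cells across the whole table, and their share of all cells.
total_missing = sum(missing.values())
missing_pct = 100 * total_missing / (n_rows * n_cols)

# Share of missing values within each variable, rounded as in the report.
per_variable_pct = {
    name: round(100 * count / n_rows, 1) for name, count in missing.items()
}

print(total_missing, round(missing_pct, 1))
print(per_variable_pct)
```

The totals agree with the 24040 missing cells (24.5%) listed under the dataset statistics.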
Store
Real number (ℝ≥0)
| Distinct count | 45 |
|---|---|
| Unique (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.0 |
|---|---|
| Minimum | 1 |
| Maximum | 45 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 64.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 12 |
| median | 23 |
| Q3 | 34 |
| 95-th percentile | 43 |
| Maximum | 45 |
| Range | 44 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 12.9879661 |
|---|---|
| Coefficient of variation (CV) | 0.5646941782 |
| Kurtosis | -1.201186459 |
| Mean | 23 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 0 |
| Sum | 188370 |
| Variance | 168.6872634 |
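Because every Store value from 1 to 45 occurs exactly 182 times, the column can be reconstructed and the statistics above re-derived. This is a sketch using only the Python standard library; it assumes the sample (n − 1) variance convention, which is consistent with the figures shown, and it is not the code pandas-profiling itself runs.

```python
import statistics

# Reconstruct the Store column: each ID 1..45 appears 182 times (8190 rows).
store = [float(s) for s in range(1, 46) for _ in range(182)]

mean = statistics.mean(store)
median = statistics.median(store)
variance = statistics.variance(store)  # sample variance, n - 1 denominator
std = statistics.stdev(store)          # square root of the sample variance
cv = std / mean                        # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
mad = statistics.median([abs(x - median) for x in store])

print(sum(store))     # report: Sum 188370
print(variance, std)  # report: Variance 168.6872634, Standard deviation 12.9879661
print(cv, mad)        # report: CV 0.5646941782, MAD 11
```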
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 43 | 182 | 2.2% | |
| 41 | 182 | 2.2% | |
| 33 | 182 | 2.2% | |
| 29 | 182 | 2.2% | |
| 25 | 182 | 2.2% | |
| 21 | 182 | 2.2% | |
| 17 | 182 | 2.2% | |
| 13 | 182 | 2.2% | |
| 9 | 182 | 2.2% | |
| 5 | 182 | 2.2% | |
| Other values (35) | 6370 | 77.8% |
| Value | Count | Frequency (%) | |
| 1 | 182 | 2.2% | |
| 2 | 182 | 2.2% | |
| 3 | 182 | 2.2% | |
| 4 | 182 | 2.2% | |
| 5 | 182 | 2.2% |
| Value | Count | Frequency (%) | |
| 45 | 182 | 2.2% | |
| 44 | 182 | 2.2% | |
| 43 | 182 | 2.2% | |
| 42 | 182 | 2.2% | |
| 41 | 182 | 2.2% |
Date
| Distinct count | 182 |
|---|---|
| Unique (%) | 2.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 64.1 KiB |
| Value | Count |
|---|---|
| 28/01/2011 | 45 |
| 19/11/2010 | 45 |
| 04/06/2010 | 45 |
| 15/07/2011 | 45 |
| 18/05/2012 | 45 |
| Other values (177) | 7965 |
| Value | Count | Frequency (%) | |
| 28/01/2011 | 45 | 0.5% | |
| 19/11/2010 | 45 | 0.5% | |
| 04/06/2010 | 45 | 0.5% | |
| 15/07/2011 | 45 | 0.5% | |
| 18/05/2012 | 45 | 0.5% | |
| 26/04/2013 | 45 | 0.5% | |
| 05/02/2010 | 45 | 0.5% | |
| 20/08/2010 | 45 | 0.5% | |
| 26/11/2010 | 45 | 0.5% | |
| 31/12/2010 | 45 | 0.5% | |
| Other values (172) | 7740 | 94.5% |
Length
| Max length | 10 |
|---|---|
| Mean length | 10 |
| Min length | 10 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 10 | 90.9% | |
| Other_Punctuation | 1 | 9.1% |
| Value | Count | Frequency (%) | |
| Common | 11 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 11 | 100.0% |
Temperature
Real number (ℝ)
| Distinct count | 4178 |
|---|---|
| Unique (%) | 51.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 59.356197802197805 |
|---|---|
| Minimum | -7.29 |
| Maximum | 101.95 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 64.1 KiB |
Quantile statistics
| Minimum | -7.29 |
|---|---|
| 5-th percentile | 26.849 |
| Q1 | 45.9025 |
| median | 60.71 |
| Q3 | 73.88 |
| 95-th percentile | 87.131 |
| Maximum | 101.95 |
| Range | 109.24 |
| Interquartile range (IQR) | 27.9775 |
Descriptive statistics
| Standard deviation | 18.67860685 |
|---|---|
| Coefficient of variation (CV) | 0.3146867141 |
| Kurtosis | -0.6108838043 |
| Mean | 59.3561978 |
| Median Absolute Deviation (MAD) | 13.995 |
| Skewness | -0.2833843522 |
| Sum | 486127.26 |
| Variance | 348.8903538 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 50.43 | 11 | 0.1% | |
| 70.28 | 11 | 0.1% | |
| 67.87 | 10 | 0.1% | |
| 76.67 | 9 | 0.1% | |
| 72.62 | 9 | 0.1% | |
| 70.87 | 9 | 0.1% | |
| 76.03 | 9 | 0.1% | |
| 53.59 | 8 | 0.1% | |
| 50.81 | 8 | 0.1% | |
| 40.65 | 8 | 0.1% | |
| Other values (4168) | 8098 | 98.9% |
| Value | Count | Frequency (%) | |
| -7.29 | 1 | < 0.1% | |
| -6.61 | 1 | < 0.1% | |
| -6.08 | 1 | < 0.1% | |
| -2.06 | 1 | < 0.1% | |
| 0.25 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 101.95 | 3 | < 0.1% | |
| 100.14 | 1 | < 0.1% | |
| 100.07 | 1 | < 0.1% | |
| 99.66 | 2 | < 0.1% | |
| 99.22 | 3 | < 0.1% |
Fuel_Price
Real number (ℝ≥0)
| Distinct count | 1011 |
|---|---|
| Unique (%) | 12.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.405991819291819 |
|---|---|
| Minimum | 2.472 |
| Maximum | 4.468 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 64.1 KiB |
Quantile statistics
| Minimum | 2.472 |
|---|---|
| 5-th percentile | 2.669 |
| Q1 | 3.041 |
| median | 3.513 |
| Q3 | 3.743 |
| 95-th percentile | 4.021 |
| Maximum | 4.468 |
| Range | 1.996 |
| Interquartile range (IQR) | 0.702 |
Descriptive statistics
| Standard deviation | 0.4313365711 |
|---|---|
| Coefficient of variation (CV) | 0.1266405188 |
| Kurtosis | -0.9523876532 |
| Mean | 3.405991819 |
| Median Absolute Deviation (MAD) | 0.298 |
| Skewness | -0.3050626486 |
| Sum | 27895.073 |
| Variance | 0.1860512376 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 3.417 | 43 | 0.5% | |
| 3.638 | 43 | 0.5% | |
| 3.63 | 40 | 0.5% | |
| 3.583 | 39 | 0.5% | |
| 3.62 | 37 | 0.5% | |
| 3.622 | 31 | 0.4% | |
| 3.524 | 31 | 0.4% | |
| 3.227 | 30 | 0.4% | |
| 3.611 | 30 | 0.4% | |
| 3.666 | 30 | 0.4% | |
| Other values (1001) | 7836 | 95.7% |
| Value | Count | Frequency (%) | |
| 2.472 | 1 | < 0.1% | |
| 2.513 | 1 | < 0.1% | |
| 2.514 | 14 | 0.2% | |
| 2.52 | 1 | < 0.1% | |
| 2.533 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 4.468 | 6 | 0.1% | |
| 4.449 | 6 | 0.1% | |
| 4.308 | 3 | < 0.1% | |
| 4.301 | 6 | 0.1% | |
| 4.294 | 6 | 0.1% |
MarkDown1
| Distinct count | 4023 |
|---|---|
| Unique (%) | 99.8% |
| Missing | 4158 |
| Missing (%) | 50.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7032.371785714286 |
|---|---|
| Minimum | -2781.45 |
| Maximum | 103184.98 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 64.1 KiB |
Quantile statistics
| Minimum | -2781.45 |
|---|---|
| 5-th percentile | 109.416 |
| Q1 | 1577.5325 |
| median | 4743.58 |
| Q3 | 8923.31 |
| 95-th percentile | 21500.9325 |
| Maximum | 103184.98 |
| Range | 105966.43 |
| Interquartile range (IQR) | 7345.7775 |
Descriptive statistics
| Standard deviation | 9262.747448 |
|---|---|
| Coefficient of variation (CV) | 1.317158383 |
| Kurtosis | 23.68716731 |
| Mean | 7032.371786 |
| Median Absolute Deviation (MAD) | 3569.965 |
| Skewness | 4.016436305 |
| Sum | 28354523.04 |
| Variance | 85798490.28 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 150.46 | 2 | < 0.1% | |
| 17.01 | 2 | < 0.1% | |
| 2920.43 | 2 | < 0.1% | |
| 460.73 | 2 | < 0.1% | |
| 6510.79 | 2 | < 0.1% | |
| 4855.31 | 2 | < 0.1% | |
| 175.64 | 2 | < 0.1% | |
| 1.5 | 2 | < 0.1% | |
| 8.62 | 2 | < 0.1% | |
| 8940.48 | 1 | < 0.1% | |
| Other values (4013) | 4013 | 49.0% | |
| (Missing) | 4158 | 50.8% |
| Value | Count | Frequency (%) | |
| -2781.45 | 1 | < 0.1% | |
| -772.21 | 1 | < 0.1% | |
| -563.9 | 1 | < 0.1% | |
| -16.93 | 1 | < 0.1% | |
| 0.27 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 103184.98 | 1 | < 0.1% | |
| 95102.5 | 1 | < 0.1% | |
| 88750.34 | 1 | < 0.1% | |
| 88646.76 | 1 | < 0.1% | |
| 84139.36 | 1 | < 0.1% |
MarkDown2
| Distinct count | 2715 |
|---|---|
| Unique (%) | 92.9% |
| Missing | 5269 |
| Missing (%) | 64.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3384.1765936323177 |
|---|---|
| Minimum | -265.76 |
| Maximum | 104519.54 |
| Zeros | 3 |
| Zeros (%) | < 0.1% |
| Memory size | 64.1 KiB |
Quantile statistics
| Minimum | -265.76 |
|---|---|
| 5-th percentile | 2.98 |
| Q1 | 68.88 |
| median | 364.57 |
| Q3 | 2153.35 |
| 95-th percentile | 17261.44 |
| Maximum | 104519.54 |
| Range | 104785.3 |
| Interquartile range (IQR) | 2084.47 |
Descriptive statistics
| Standard deviation | 8793.583016 |
|---|---|
| Coefficient of variation (CV) | 2.598440942 |
| Kurtosis | 32.34218663 |
| Mean | 3384.176594 |
| Median Absolute Deviation (MAD) | 355.37 |
| Skewness | 4.962258122 |
| Sum | 9885179.83 |
| Variance | 77327102.25 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 3 | 11 | 0.1% | |
| 1.5 | 10 | 0.1% | |
| 0.5 | 9 | 0.1% | |
| 4 | 9 | 0.1% | |
| 0.03 | 8 | 0.1% | |
| 1.91 | 8 | 0.1% | |
| 6 | 7 | 0.1% | |
| 9 | 5 | 0.1% | |
| 3.82 | 5 | 0.1% | |
| 5.73 | 5 | 0.1% | |
| Other values (2705) | 2844 | 34.7% | |
| (Missing) | 5269 | 64.3% |
| Value | Count | Frequency (%) | |
| -265.76 | 1 | < 0.1% | |
| -192 | 1 | < 0.1% | |
| -35.74 | 1 | < 0.1% | |
| -20 | 1 | < 0.1% | |
| -15.45 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 104519.54 | 1 | < 0.1% | |
| 97740.99 | 1 | < 0.1% | |
| 92523.94 | 1 | < 0.1% | |
| 89121.94 | 1 | < 0.1% | |
| 82881.16 | 1 | < 0.1% |
MarkDown3
| Distinct count | 2885 |
|---|---|
| Unique (%) | 79.9% |
| Missing | 4577 |
| Missing (%) | 55.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1760.1001799058954 |
|---|---|
| Minimum | -179.26 |
| Maximum | 149483.31 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 64.1 KiB |
Quantile statistics
| Minimum | -179.26 |
|---|---|
| 5-th percentile | 0.782 |
| Q1 | 6.6 |
| median | 36.26 |
| Q3 | 163.15 |
| 95-th percentile | 1159.758 |
| Maximum | 149483.31 |
| Range | 149662.57 |
| Interquartile range (IQR) | 156.55 |
Descriptive statistics
| Standard deviation | 11276.46221 |
|---|---|
| Coefficient of variation (CV) | 6.406716127 |
| Kurtosis | 72.06807509 |
| Mean | 1760.10018 |
| Median Absolute Deviation (MAD) | 34.16 |
| Skewness | 8.133805548 |
| Sum | 6359241.95 |
| Variance | 127158599.9 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 1 | 17 | 0.2% | |
| 3 | 15 | 0.2% | |
| 2 | 15 | 0.2% | |
| 6 | 14 | 0.2% | |
| 0.6 | 12 | 0.1% | |
| 4 | 11 | 0.1% | |
| 1.2 | 10 | 0.1% | |
| 0.24 | 9 | 0.1% | |
| 0.5 | 9 | 0.1% | |
| 0.3 | 9 | 0.1% | |
| Other values (2875) | 3492 | 42.6% | |
| (Missing) | 4577 | 55.9% |
| Value | Count | Frequency (%) | |
| -179.26 | 1 | < 0.1% | |
| -89.1 | 1 | < 0.1% | |
| -44.54 | 1 | < 0.1% | |
| -29.1 | 1 | < 0.1% | |
| -23.97 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 149483.31 | 1 | < 0.1% | |
| 146394.44 | 1 | < 0.1% | |
| 141630.61 | 1 | < 0.1% | |
| 139621.51 | 1 | < 0.1% | |
| 130129.11 | 1 | < 0.1% |
MarkDown4
| Distinct count | 3405 |
|---|---|
| Unique (%) | 98.3% |
| Missing | 4726 |
| Missing (%) | 57.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3292.9358862586605 |
|---|---|
| Minimum | 0.22 |
| Maximum | 67474.85 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 64.1 KiB |
Quantile statistics
| Minimum | 0.22 |
|---|---|
| 5-th percentile | 18.4695 |
| Q1 | 304.6875 |
| median | 1176.425 |
| Q3 | 3310.0075 |
| 95-th percentile | 12863.771 |
| Maximum | 67474.85 |
| Range | 67474.63 |
| Interquartile range (IQR) | 3005.32 |
Descriptive statistics
| Standard deviation | 6792.329861 |
|---|---|
| Coefficient of variation (CV) | 2.06269727 |
| Kurtosis | 29.00029382 |
| Mean | 3292.935886 |
| Median Absolute Deviation (MAD) | 1070.015 |
| Skewness | 4.864484796 |
| Sum | 11406729.91 |
| Variance | 46135744.95 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 3 | 5 | 0.1% | |
| 2 | 4 | < 0.1% | |
| 2.5 | 4 | < 0.1% | |
| 4 | 4 | < 0.1% | |
| 9 | 4 | < 0.1% | |
| 2.61 | 4 | < 0.1% | |
| 3.97 | 3 | < 0.1% | |
| 8 | 3 | < 0.1% | |
| 0.63 | 3 | < 0.1% | |
| 12 | 3 | < 0.1% | |
| Other values (3395) | 3427 | 41.8% | |
| (Missing) | 4726 | 57.7% |
| Value | Count | Frequency (%) | |
| 0.22 | 2 | < 0.1% | |
| 0.41 | 1 | < 0.1% | |
| 0.46 | 1 | < 0.1% | |
| 0.63 | 3 | < 0.1% | |
| 0.66 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 67474.85 | 1 | < 0.1% | |
| 65344.64 | 1 | < 0.1% | |
| 63830.91 | 1 | < 0.1% | |
| 63130.81 | 1 | < 0.1% | |
| 60065.82 | 1 | < 0.1% |
MarkDown5
| Distinct count | 4045 |
|---|---|
| Unique (%) | 99.9% |
| Missing | 4140 |
| Missing (%) | 50.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4132.216422222222 |
|---|---|
| Minimum | -185.17 |
| Maximum | 771448.1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 64.1 KiB |
Quantile statistics
| Minimum | -185.17 |
|---|---|
| 5-th percentile | 577.679 |
| Q1 | 1440.8275 |
| median | 2727.135 |
| Q3 | 4832.555 |
| 95-th percentile | 10227.8585 |
| Maximum | 771448.1 |
| Range | 771633.27 |
| Interquartile range (IQR) | 3391.7275 |
Descriptive statistics
| Standard deviation | 13086.69028 |
|---|---|
| Coefficient of variation (CV) | 3.16699053 |
| Kurtosis | 2923.05653 |
| Mean | 4132.216422 |
| Median Absolute Deviation (MAD) | 1482.82 |
| Skewness | 50.2778242 |
| Sum | 16735476.51 |
| Variance | 171261462.4 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 3113.78 | 2 | < 0.1% | |
| 986.23 | 2 | < 0.1% | |
| 1327.97 | 2 | < 0.1% | |
| 2743.18 | 2 | < 0.1% | |
| 1064.56 | 2 | < 0.1% | |
| 2248.72 | 1 | < 0.1% | |
| 1044.74 | 1 | < 0.1% | |
| 3154.77 | 1 | < 0.1% | |
| 1756.07 | 1 | < 0.1% | |
| 6207.39 | 1 | < 0.1% | |
| Other values (4035) | 4035 | 49.3% | |
| (Missing) | 4140 | 50.5% |
| Value | Count | Frequency (%) | |
| -185.17 | 1 | < 0.1% | |
| -37.02 | 1 | < 0.1% | |
| 40.98 | 1 | < 0.1% | |
| 60.92 | 1 | < 0.1% | |
| 114.25 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 771448.1 | 1 | < 0.1% | |
| 108519.28 | 1 | < 0.1% | |
| 105223.11 | 1 | < 0.1% | |
| 85851.87 | 1 | < 0.1% | |
| 63005.58 | 1 | < 0.1% |
CPI
| Distinct count | 2505 |
|---|---|
| Unique (%) | 32.9% |
| Missing | 585 |
| Missing (%) | 7.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 172.46080918276135 |
|---|---|
| Minimum | 126.064 |
| Maximum | 228.9764563 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 64.1 KiB |
Quantile statistics
| Minimum | 126.064 |
|---|---|
| 5-th percentile | 126.5621 |
| Q1 | 132.3648387 |
| median | 182.7640032 |
| Q3 | 213.9324122 |
| 95-th percentile | 223.8693849 |
| Maximum | 228.9764563 |
| Range | 102.9124563 |
| Interquartile range (IQR) | 81.5675735 |
Descriptive statistics
| Standard deviation | 39.7383461 |
|---|---|
| Coefficient of variation (CV) | 0.2304195735 |
| Kurtosis | -1.832113304 |
| Mean | 172.4608092 |
| Median Absolute Deviation (MAD) | 42.0385282 |
| Skewness | 0.06766805636 |
| Sum | 1311564.454 |
| Variance | 1579.136151 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 132.7160968 | 33 | 0.4% | |
| 139.1226129 | 24 | 0.3% | |
| 201.0705712 | 12 | 0.1% | |
| 224.8025314 | 12 | 0.1% | |
| 130.683 | 11 | 0.1% | |
| 129.7706452 | 11 | 0.1% | |
| 132.4668065 | 11 | 0.1% | |
| 130.737871 | 11 | 0.1% | |
| 126.2085484 | 11 | 0.1% | |
| 129.8364 | 11 | 0.1% | |
| Other values (2495) | 7458 | 91.1% | |
| (Missing) | 585 | 7.1% |
| Value | Count | Frequency (%) | |
| 126.064 | 11 | 0.1% | |
| 126.0766452 | 11 | 0.1% | |
| 126.0854516 | 11 | 0.1% | |
| 126.0892903 | 11 | 0.1% | |
| 126.1019355 | 11 | 0.1% |
| Value | Count | Frequency (%) | |
| 228.9764563 | 3 | < 0.1% | |
| 228.8892482 | 1 | < 0.1% | |
| 228.8020401 | 1 | < 0.1% | |
| 228.7796682 | 3 | < 0.1% | |
| 228.7298638 | 6 | 0.1% |
Unemployment
| Distinct count | 404 |
|---|---|
| Unique (%) | 5.3% |
| Missing | 585 |
| Missing (%) | 7.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.8268210387902695 |
|---|---|
| Minimum | 3.684 |
| Maximum | 14.313 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 64.1 KiB |
Quantile statistics
| Minimum | 3.684 |
|---|---|
| 5-th percentile | 5.143 |
| Q1 | 6.634 |
| median | 7.806 |
| Q3 | 8.567 |
| 95-th percentile | 10.926 |
| Maximum | 14.313 |
| Range | 10.629 |
| Interquartile range (IQR) | 1.933 |
Descriptive statistics
| Standard deviation | 1.877258594 |
|---|---|
| Coefficient of variation (CV) | 0.2398494337 |
| Kurtosis | 2.498221012 |
| Mean | 7.826821039 |
| Median Absolute Deviation (MAD) | 0.915 |
| Skewness | 1.067685459 |
| Sum | 59522.974 |
| Variance | 3.524099828 |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) | |
| 8.099 | 78 | 1.0% | |
| 7.852 | 56 | 0.7% | |
| 8.163 | 56 | 0.7% | |
| 8.625 | 54 | 0.7% | |
| 7.057 | 52 | 0.6% | |
| 7.441 | 52 | 0.6% | |
| 6.565 | 52 | 0.6% | |
| 7.931 | 52 | 0.6% | |
| 8.2 | 52 | 0.6% | |
| 6.891 | 52 | 0.6% | |
| Other values (394) | 7049 | 86.1% | |
| (Missing) | 585 | 7.1% |
| Value | Count | Frequency (%) | |
| 3.684 | 8 | 0.1% | |
| 3.879 | 13 | 0.2% | |
| 3.896 | 4 | < 0.1% | |
| 3.921 | 13 | 0.2% | |
| 3.932 | 26 | 0.3% |
| Value | Count | Frequency (%) | |
| 14.313 | 42 | 0.5% | |
| 14.18 | 39 | 0.5% | |
| 14.099 | 39 | 0.5% | |
| 14.021 | 36 | 0.4% | |
| 13.975 | 24 | 0.3% |
IsHoliday
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.1 KiB |
| Value | Count |
|---|---|
| False | 7605 |
| True | 585 |
| Value | Count | Frequency (%) | |
| False | 7605 | 92.9% | |
| True | 585 | 7.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
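As a minimal sketch of that calculation, the function below divides the covariance by the product of the standard deviations using only the standard library. The name `pearson_r` and the sample data are illustrative, not taken from this dataset or from the profiling tool.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: an exact linear relation and a shifted, rescaled copy of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * v + 1.0 for v in x]
x_rescaled = [10.0 * v - 3.0 for v in x]

print(pearson_r(x, y))
print(pearson_r(x_rescaled, y))
```

Rescaling or shifting `x` leaves the result unchanged, which illustrates the location and scale invariance mentioned above.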
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
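The rank-based definition can be sketched as follows: replace each value with its rank, then apply the Pearson formula to the ranks. The helper names are illustrative, and giving tied values the average of their positions is an assumption of this sketch (a common convention), not something stated in the report.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: a monotonic but nonlinear relation.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print(spearman_rho(x, y))  # rank agreement is perfect
print(pearson_r(x, y))     # linear agreement is not
```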
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
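The pair-counting definition given here corresponds to the variant often called tau-a. The sketch below implements that plain form; the function name is illustrative, tied pairs are counted as neither concordant nor discordant, and library implementations may apply a tie correction that this sketch does not.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables move in the same direction
        elif sign < 0:
            discordant += 1  # the variables move in opposite directions
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six gives (5 - 1) / 6.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau_a(x, y))
```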
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. There is extensive documentation available.
First rows
| | Store | Date | Temperature | Fuel_Price | MarkDown1 | MarkDown2 | MarkDown3 | MarkDown4 | MarkDown5 | CPI | Unemployment | IsHoliday |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 05/02/2010 | 42.31 | 2.572 | NaN | NaN | NaN | NaN | NaN | 211.096358 | 8.106 | False |
| 1 | 1 | 12/02/2010 | 38.51 | 2.548 | NaN | NaN | NaN | NaN | NaN | 211.242170 | 8.106 | True |
| 2 | 1 | 19/02/2010 | 39.93 | 2.514 | NaN | NaN | NaN | NaN | NaN | 211.289143 | 8.106 | False |
| 3 | 1 | 26/02/2010 | 46.63 | 2.561 | NaN | NaN | NaN | NaN | NaN | 211.319643 | 8.106 | False |
| 4 | 1 | 05/03/2010 | 46.50 | 2.625 | NaN | NaN | NaN | NaN | NaN | 211.350143 | 8.106 | False |
| 5 | 1 | 12/03/2010 | 57.79 | 2.667 | NaN | NaN | NaN | NaN | NaN | 211.380643 | 8.106 | False |
| 6 | 1 | 19/03/2010 | 54.58 | 2.720 | NaN | NaN | NaN | NaN | NaN | 211.215635 | 8.106 | False |
| 7 | 1 | 26/03/2010 | 51.45 | 2.732 | NaN | NaN | NaN | NaN | NaN | 211.018042 | 8.106 | False |
| 8 | 1 | 02/04/2010 | 62.27 | 2.719 | NaN | NaN | NaN | NaN | NaN | 210.820450 | 7.808 | False |
| 9 | 1 | 09/04/2010 | 65.86 | 2.770 | NaN | NaN | NaN | NaN | NaN | 210.622857 | 7.808 | False |
Last rows
| | Store | Date | Temperature | Fuel_Price | MarkDown1 | MarkDown2 | MarkDown3 | MarkDown4 | MarkDown5 | CPI | Unemployment | IsHoliday |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8180 | 45 | 24/05/2013 | 67.11 | 3.627 | 3249.34 | 481.82 | 58.48 | 1183.23 | 1309.30 | NaN | NaN | False |
| 8181 | 45 | 31/05/2013 | 65.88 | 3.646 | 6474.49 | 411.38 | 77.06 | 9.38 | 4227.27 | NaN | NaN | False |
| 8182 | 45 | 07/06/2013 | 70.71 | 3.633 | 9977.82 | 744.29 | 80.00 | 4825.71 | 3597.34 | NaN | NaN | False |
| 8183 | 45 | 14/06/2013 | 70.01 | 3.632 | 2471.44 | 517.87 | 348.54 | 2612.33 | 3459.39 | NaN | NaN | False |
| 8184 | 45 | 21/06/2013 | 70.13 | 3.626 | 4989.34 | 385.31 | 178.56 | 2463.42 | 3117.94 | NaN | NaN | False |
| 8185 | 45 | 28/06/2013 | 76.05 | 3.639 | 4842.29 | 975.03 | 3.00 | 2449.97 | 3169.69 | NaN | NaN | False |
| 8186 | 45 | 05/07/2013 | 77.50 | 3.614 | 9090.48 | 2268.58 | 582.74 | 5797.47 | 1514.93 | NaN | NaN | False |
| 8187 | 45 | 12/07/2013 | 79.37 | 3.614 | 3789.94 | 1827.31 | 85.72 | 744.84 | 2150.36 | NaN | NaN | False |
| 8188 | 45 | 19/07/2013 | 82.84 | 3.737 | 2961.49 | 1047.07 | 204.19 | 363.00 | 1059.46 | NaN | NaN | False |
| 8189 | 45 | 26/07/2013 | 76.06 | 3.804 | 212.02 | 851.73 | 2.06 | 10.88 | 1864.57 | NaN | NaN | False |
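The Date strings are in day/month/year form, and the sample rows step forward one week at a time. As a rough consistency check, the sketch below assumes that 05/02/2010 (first row) is the earliest date, 26/07/2013 (last row) is the latest, and every week in between is present. Under those assumptions the number of weekly dates should match the 182 distinct values reported for Date.

```python
from datetime import datetime, timedelta

DATE_FORMAT = "%d/%m/%Y"  # day first, as in the sample rows

# Assumed endpoints: the dates of the first and last rows shown above.
first = datetime.strptime("05/02/2010", DATE_FORMAT).date()
last = datetime.strptime("26/07/2013", DATE_FORMAT).date()

# Whole weeks between the endpoints; a non-zero remainder would mean the
# endpoints are not on the same weekday.
weeks, remainder = divmod((last - first).days, 7)
n_weekly_dates = weeks + 1

# Enumerate the assumed weekly calendar.
weekly_dates = [first + timedelta(weeks=i) for i in range(n_weekly_dates)]

print(n_weekly_dates, remainder)
print(n_weekly_dates * 45)  # dates times stores, to compare with the row count
```

With 45 stores per date, the product equals the 8190 observations in the dataset statistics.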